Unlike nuclear localization signals, there is no obvious consensus sequence for the targeting of proteins to the nucleolus. The nucleolus is a dynamic subnuclear structure which is crucial to the normal operation of the eukaryotic cell. Studying nucleolar trafficking signals is problematic as many nucleolar retention signals (NoRSs) are part of classical nuclear localization signals (NLSs). In addition, there is no known consensus signal with which to inform a study. The avian infectious bronchitis virus (IBV) coronavirus nucleocapsid (N) protein localizes to the cytoplasm and the nucleolus. Mutagenesis was used to delineate a novel eight amino acid motif that was necessary and sufficient for nucleolar retention of N protein and colocalized with nucleolin and fibrillarin. Additionally, a classical nuclear export signal (NES) functioned to direct N protein to the cytoplasm. Comparison of the coronavirus NoRSs with known cellular and other viral NoRSs revealed that these motifs have conserved arginine residues. Molecular modelling, using the solution structure of severe acute respiratory syndrome (SARS) coronavirus N protein, revealed that this motif is available for interaction with cellular factors which may mediate nucleolar localization. We hypothesise that the N protein uses these signals to traffic to and from the nucleolus and the cytoplasm.

The nucleolus is a dynamic subnuclear structure involved in ribosome subunit biogenesis, in RNA processing, in cell cycle control and as a sensor for cell stress (1-4). Morphologically, the nucleolus can be divided into fibrillar centre(s) (FC), a dense fibrillar component (DFC) and an outer granular component (GC). A directed proteomic analysis, followed by bioinformatic analysis, revealed that the nucleolus is composed of at least 700 proteins (5-7). Whilst the rules and signals governing the nuclear localization and nuclear export of proteins are well defined, those concerning nucleolar localization/retention are not. Nuclear localization signals (NLSs) can be classified into several categories. The majority of motifs identified thus far, such as 'pat4', 'pat7' and bipartite signals, are composed of basic amino acids within a given sequence length (8-10). Protein nuclear export signals (NESs) again vary, but one of the most common and well characterized is an approximately 11 amino acid leucine-rich signal, typified by LxxxLxxLxxL, where a number of hydrophobic amino acids can substitute for L, and the spacer regions (x) can vary in number (9,11).
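
As a rough illustration of the leucine-rich pattern described above, the following Python sketch scans a protein sequence for LxxxLxxLxxL-like motifs. The set of hydrophobic residues allowed to substitute for leucine (L, I, V, F, M) and the permitted spacer lengths are assumptions chosen for illustration; they are not the rules used by NetNES or by any published predictor, and the function name find_nes_like is hypothetical.

```python
import re

# Assumption: residues allowed to substitute for leucine.
PHI = "[LIVFM]"

# Assumption: spacers of 2-3, 2-3 and 1-2 residues between the four
# hydrophobic positions. A lookahead is used so overlapping hits are found.
NES_LIKE = re.compile(
    "(?=(" + PHI + ".{2,3}" + PHI + ".{2,3}" + PHI + ".{1,2}" + PHI + "))"
)


def find_nes_like(seq, first_pos=1):
    """Return (start, end, motif) tuples in 1-based protein coordinates."""
    hits = []
    for match in NES_LIKE.finditer(seq):
        motif = match.group(1)
        start = match.start(1) + first_pos
        hits.append((start, start + len(motif) - 1, motif))
    return hits


if __name__ == "__main__":
    # Synthetic test sequence containing one exact LxxxLxxLxxL motif.
    for hit in find_nes_like("GGLAAALAALAALGG"):
        print(hit)
```

A pattern match of this kind only flags candidates; whether a candidate functions as an export signal has to be tested experimentally.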


In contrast to the NLSs and NESs, nucleolar localization/retention signals (NoRSs) and pathways are not well characterized (12). The signals vary but are usually rich in arginine and lysine, although there is no obvious consensus. Examples include the MQRKPTIRRKNLRLRRK motif identified in survivin-deltaEx3 protein (13) and the RSRKYTSWYVALKR motif of the 18-kDa fibroblast growth factor-2 (14). Nucleolar localization can also be regulated by binding accessory factors, such as nucleostemin binding to GTP (15). Compared with NESs, where the leucine-rich export signal (for CRM-1 type nuclear export receptors) is accessible for interaction with carrier proteins (16), the structural context of a NoRS is not well characterized. In many cases, proteins localizing to the cytoplasm, nucleus and nucleolus contain multiple signals to determine their subcellular localization (17-20). This highlights the difficulty in identifying NoRSs: many proteins which localize to the nucleolus also localize to the nucleus and contain both classical nuclear and nucleolar signals, which can also overlap (14,21,22).


We use the avian infectious bronchitis virus (IBV) coronavirus nucleocapsid (N) protein as a model to study nucleolar retention. We have previously shown that N protein is present in the cytoplasm but can also actively localize to the nucleolus (23-25) and interact with the nucleolar proteins nucleolin and fibrillarin (26). Coronaviruses can also interact with host cell processes, such as the cell cycle (27,28) and signal transduction pathways (29,30). Coronaviruses, together with the closely related arteriviruses, are nidoviruses, a group of positive strand RNA viruses which cause a variety of different diseases (31,32). For example, IBV and severe acute respiratory syndrome coronavirus (SARS-CoV) both cause respiratory disease (33,34), whereas murine coronaviruses can cause hepatitis and demyelination (35) and porcine coronaviruses cause respiratory and gastroenteric diseases (36).

IBV (Beaudette strain) N protein, composed of 409 amino acids with a predicted molecular weight of 45 kDa, is a phosphoprotein which can bind viral RNA with high affinity (37) and also modulate cellular processes (24,26). On the basis of amino acid sequence comparison, three conserved regions (1, 2 and 3) have been identified in the coronavirus N protein, which in the case of IBV N protein map to the N-terminal (amino acids 1-133), central (amino acids 134-265) and C-terminal (amino acids 266-409) parts of the protein, respectively (38). Also, mass spectroscopy revealed that conserved phosphorylation sites are present in regions 2 and 3 of both avian and porcine coronavirus N proteins (37,39). Given the subcellular localization of IBV N protein to the nucleolus, but not the nucleus, we hypothesized that the protein would contain a unique NoRS and possibly a signal for export of the protein into the cytoplasm. Therefore, using a combination of deletion and substitution mutagenesis, coupled with live cell imaging and confocal microscopy, we have tested this prediction.


Results 


Bioinformatic analysis and preliminary molecular 
investigation of nuclear import and export signals in 
IBV N protein 

To identify whether there were potential NLSs (which could form part of a NoRS) and/or NESs in IBV N protein, we first conducted a bioinformatic analysis of the protein using existing motif prediction algorithms. PredictNLS (8) and PSORTII (40) were used to identify potential NLSs, and the NES predictor NetNES (16) was used to identify potential NESs. PredictNLS found no NLSs, whereas PSORTII indicated that IBV N protein contained two potential overlapping NLSs in region 3 between residues 358-366, a pat4 motif (RPKK) and a pat7 motif (PKKEKKL) (Figure 1A). NetNES predicted a potential CRM-1-dependent NES between residues 291 and 298 (LQLDGLHL), also in region 3 (Figure 1A). To investigate whether these and other unknown signals operated to determine the subcellular trafficking of N protein, we divided the protein into three regions [based on conservation between IBV strains and other coronavirus N proteins (38)] and combinations of regions 1 plus 2 and 2 plus 3, and cloned these downstream of enhanced cyan fluorescent protein (ECFP), generating plasmids pECFP-IBVNR1+2, pECFP-IBVNR2+3, pECFP-IBVNR1, pECFP-IBVNR2 and pECFP-IBVNR3. When expressed in cells, these would lead to the synthesis of ECFP fused to regions 1 plus 2 and 2 plus 3 and to the individual regions 1, 2 and 3, respectively (Figure 1B). Vero cells, a model cell line to study IBV-cell interactions (24,26,41,42), were transfected with each of these plasmids. Recombinant fusion proteins were imaged at 24 h post-transfection using live cell imaging (direct fluorescence). As controls, cells were also transfected with pECFP-C1, which leads to the expression of ECFP only, and with pEGFP-IBVN, in which wild-type N protein is cloned downstream of EGFP, as described (25) (Figure 2). In a parallel series of experiments, cells were cotransfected with each of the five pECFP-IBVN region plasmids and pEGFP-nucleolin (Figure 2). This latter construct allows the expression of a nucleolar marker protein, nucleolin, tagged to EGFP (25). Dual-transfected cells were fixed at 24 h post-transfection for confocal analysis by direct fluorescence.
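
To make the two predicted NLS classes concrete, the sketch below scans a sequence for pat4-like and pat7-like motifs. It follows a simplified reading of the commonly quoted definitions (pat4: four basic residues, or three basic residues plus one histidine or proline; pat7: a proline followed within three residues by a four-residue segment containing at least three basic residues). It is an illustrative assumption, not the PSORTII implementation, and the function names are hypothetical.

```python
BASIC = set("KR")


def find_pat4(seq):
    """pat4-like: 4 residues that are all K/R, or 3 K/R plus one H or P."""
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = sum(aa in BASIC for aa in window)
        n_hp = sum(aa in "HP" for aa in window)
        if n_basic == 4 or (n_basic == 3 and n_hp == 1):
            hits.append((i + 1, window))  # 1-based start position
    return hits


def find_pat7(seq):
    """pat7-like: P, then within 3 residues a 4-residue segment with >= 3 K/R."""
    hits = []
    for i in range(len(seq) - 6):
        window = seq[i:i + 7]
        if window[0] != "P":
            continue
        # The basic segment may start 1, 2 or 3 positions after the proline.
        if any(sum(aa in BASIC for aa in window[s:s + 4]) >= 3 for s in (1, 2, 3)):
            hits.append((i + 1, window))
    return hits


if __name__ == "__main__":
    # The two overlapping motifs reported for region 3, joined as RPKKEKKL.
    print(find_pat4("RPKKEKKL"))
    print(find_pat7("RPKKEKKL"))
```

Positions returned here are relative to the string supplied, so they would need an offset to be expressed in N protein coordinates.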


Live cell imaging indicated that ECFP localized to both the cytoplasm and nucleus, but not the nucleolus, whereas EGFP-IBVN protein localized to both the cytoplasm and nucleolus but not the nucleus, as described previously (23,25). ECFP-IBVNR1+2 protein localized to the nucleus and nucleolus, whereas ECFP-IBVNR2+3 protein was predominantly cytoplasmic. Confocal analysis of dual-transfected cells was used to confirm the presence of the nucleolus using the marker protein, EGFP-nucleolin, and reflected the live cell imaging results (Figure 2). The data also indicated that ECFP-IBVNR1+2, but not ECFP-IBVNR2+3, colocalized with nucleolin. Live cell analysis of the single region constructs (pECFP-IBVNR1, pECFP-IBVNR2 and pECFP-IBVNR3) indicated that ECFP-IBVNR1 localized predominantly to the nucleolus and also to the nucleus; ECFP-IBVNR2 localized predominantly to the nucleus and appeared to accumulate in the nucleolus to the same level as in the nucleus; and ECFP-IBVNR3 localized predominantly to the cytoplasm. Confocal analysis confirmed these findings and indicated that ECFP-IBVNR1 colocalized with nucleolin, whereas the result for ECFP-IBVNR2 was indeterminate and ECFP-IBVNR3 did not colocalize (Figure 2).


These data suggested that a potential NoRS could be located in region 1 and that region 2 contained a potential NLS(s) not identified by the bioinformatic analysis and also a possible NoRS. The data also suggested that, because region 3 was directed to the cytoplasm, the putative NES was dominant over the predicted NLSs. None of these fusion proteins had a distribution similar to ECFP only. Therefore, the potential of regions 1 and 2 to direct nucleolar localization and of region 3 to promote nuclear export was investigated.


Delineation of a NoRS in region 1 of N protein 

Having identified that region 1 of IBV N protein localized to the nucleolus, we tested the hypothesis that it contained a NoRS to target the protein to the nucleolus. To investigate this prediction, a series of expression constructs containing fragments of region 1 was constructed, based on the rationale of keeping sequences of basic and non-basic amino acids discrete: pECFP-IBVNR1(1-50), pECFP-IBVNR1(51-100) and pECFP-IBVNR1(101-132) (Figure 3A). Vero cells were transfected with these constructs and analyzed by live cell imaging at 24 h post-transfection. The data indicated that ECFP-IBVNR1(1-50) and ECFP-IBVNR1(101-132) had a similar localization pattern to ECFP only, whereas ECFP-IBVNR1(51-100) localized predominantly to the nucleus and nucleolus (Figure 3B). These data were confirmed by cotransfecting cells with either pECFP-IBVNR1(1-50), pECFP-IBVNR1(51-100) or pECFP-IBVNR1(101-132), and pEGFP-nucleolin; cells were fixed at 24 h post-transfection and analyzed using confocal microscopy (Figure 3B). The data also indicated that ECFP-IBVNR1(51-100) colocalized with nucleolin, whereas the other two fusion proteins did not. Therefore, the fragment containing amino acids 51-100 of IBV N protein could direct an exogenous protein to the nucleolus.


Coronavirus Nucleocapsid Protein Trafficking

1   MASGKAAGKT DAPAPVIKLG GPKPPKVGSS GNASWFQAIK AKKLNTPPPK
51  FEGSGVPDNE NIKPSQQHGY WRRQARFKPG KGGRKPVPDA WYFYYTGTGP
101 AADLNWGDTQ DGIVWVAAKG ADTKSRSNQG TRDPDKFDQY PLRFSDGGPD
151 GNFRWDFIPL NRGRSGRSTA ASSAAASRAP SREGSRGRRS DSGDDLIARA
201 AKIIQDQQKK GSRITKAKAD EMAHRRYCKR TIPPNYRVDQ VEFGPRT RTKGKE
251 GNFGDDKMNE EGIKDGRVTA MLNLVPSSHA CLFGSRVTPK LQLDGLHLRF
301 EFTTVVPCDD PQFDNYVKIC DQCVDGVGTR PKDDEPKPKS RSSIRPATRG
    KQD DEADKALTSD EERNNAQLEF YDE

Figure 1: (A) Amino acid sequence of IBV (Beaudette strain) N protein with the positions of the predicted NES and NLSs indicated. (B) Block diagram detailing the 409 amino acid length of N protein C-terminally fused to EGFP and the three regions of N protein fused C-terminally to ECFP as used in this study. The putative NLSs and NESs are indicated.

To further refine the amino acids involved in nucleolar retention, 20 amino acid overlapping motifs encompassing amino acids 61-100 were cloned downstream of ECFP, creating plasmids pECFP-IBVNR1(61-80), pECFP-IBVNR1(71-90) and pECFP-IBVNR1(81-100) for the expression of recombinant fusion proteins (Figure 4A). Amino acids 51-60 were not included in this analysis as there were no arginine or lysine residues present in this sequence and basic amino acids form part of known NoRSs. Vero cells were transfected with pECFP-IBVNR1(61-80), pECFP-IBVNR1(71-90) and pECFP-IBVNR1(81-100) and analyzed 24 h post-transfection using live cell imaging. Also, cells were cotransfected with these constructs and pEGFP-nucleolin, fixed at 24 h post-transfection and analyzed using confocal microscopy (Figure 4B). The data indicated that ECFP-IBVNR1(71-90) and ECFP-IBVNR1(61-80) localized to the nucleus and nucleolus (and colocalized with nucleolin). However, ECFP-IBVNR1(81-100) localized to the cytoplasm and nucleus but not the nucleolus (and did not colocalize with nucleolin). Therefore, the amino acids at positions 61-90 in IBV N protein were able to direct an exogenous protein to the nucleus/nucleolus.
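
The overlapping 20 amino acid fragments described above can be generated programmatically, which is a convenient way to check fragment boundaries before designing constructs. The sketch below is illustrative only: the residue 61-100 sequence is transcribed from Figure 1A (an assumption about the exact residues), and the helper name overlapping_fragments is hypothetical.

```python
# Residues 61-100 of IBV N protein, transcribed from Figure 1A (assumption).
REGION_61_100 = "NIKPSQQHGYWRRQARFKPGKGGRKPVPDAWYFYYTGTGP"


def overlapping_fragments(seq, first_pos, length=20, step=10):
    """Return (start, end, fragment) tuples in 1-based protein coordinates.

    first_pos is the protein position of seq[0]; windows of `length`
    residues are taken every `step` residues, so neighbours overlap by
    length - step residues.
    """
    fragments = []
    for i in range(0, len(seq) - length + 1, step):
        start = first_pos + i
        fragments.append((start, start + length - 1, seq[i:i + length]))
    return fragments


if __name__ == "__main__":
    for start, end, frag in overlapping_fragments(REGION_61_100, first_pos=61):
        print(f"{start}-{end}: {frag}")
```

With a window of 20 and a step of 10, this yields the three fragments 61-80, 71-90 and 81-100, each sharing ten residues with its neighbour.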


Figure 2: Live cell imaging showing the subcellular localization of the fluorescent fusion proteins ECFP, EGFP-IBV N, ECFP-IBVNR1, ECFP-IBVNR2, ECFP-IBVNR3, ECFP-IBVNR1+2 and ECFP-IBVNR2+3. Vero cells were visualized 24 h post-transfection in culture conditions using a Nikon Eclipse TS100 microscope. Confocal analysis of the subcellular localization of ECFP-IBVNR1, ECFP-IBVNR2, ECFP-IBVNR3, ECFP-IBVNR1+2 and ECFP-IBVNR2+3 proteins in cells coexpressing EGFP-nucleolin, at 24 h post-transfection. The IBV fusion peptides are coloured green and the nucleolin fusion protein coloured red. Merged images are also presented. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.

To further define the amino acids involved in nucleolar trafficking, we conducted a tetra-alanine substitution mutagenesis of amino acids 71-90. The substituted sequences were placed downstream of ECFP, creating expression plasmids pECFP-IBVNR1(71WRRQ→AAAA), pECFP-IBVNR1(75ARFK→AAAA), pECFP-IBVNR1(79PGKG→AAAA), pECFP-IBVNR1(83GRKP→AAAA) and pECFP-IBVNR1(87VPDA→AAAA). In some cases (75ARFK and 87VPDA), the wild-type alanine was therefore retained rather than substituted. Amino acids 61-70 were excluded from the substitution analysis as no basic residues were present. These expression plasmids were transfected into Vero cells and the distribution of the respective fluorescent fusion proteins analyzed by live cell imaging at 24 h post-transfection (Figure 5A). The data indicated that substituting 71WRRQ with AAAA (pECFP-IBVNR1(71WRRQ→AAAA)) abolished nucleolar retention of the recombinant fusion protein and that substitution of 75ARFK with AAAA (pECFP-IBVNR1(75ARFK→AAAA)) resulted in reduced nucleolar retention. The remaining tetra-alanine substitutions had no effect on nucleolar retention, indicating that amino acids 71WRRQARFK78 were involved in nucleolar retention.
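
The tetra-alanine substitution scan of residues 71-90 can be written out as a short sketch, which makes explicit which four-residue block is replaced in each mutant. The wild-type sequence is taken from the 71-90 fragment shown in Figure 5A; the function name tetra_alanine_scan and the mutant labels are hypothetical conventions used only for this illustration.

```python
# Residues 71-90 of IBV N protein, as shown for the wild-type peptide in Figure 5A.
WILD_TYPE_71_90 = "WRRQARFKPGKGGRKPVPDA"


def tetra_alanine_scan(seq, first_pos, block=4):
    """Replace each consecutive block of residues with alanines.

    Returns a dict mapping a label such as '71WRRQ->AAAA' to the mutated
    sequence. Residues that are already alanine are simply left as alanine.
    """
    mutants = {}
    for i in range(0, len(seq), block):
        original = seq[i:i + block]
        alanines = "A" * len(original)
        label = f"{first_pos + i}{original}->{alanines}"
        mutants[label] = seq[:i] + alanines + seq[i + block:]
    return mutants


if __name__ == "__main__":
    for label, mutated in tetra_alanine_scan(WILD_TYPE_71_90, first_pos=71).items():
        print(f"{label}: {mutated}")
```

Because blocks 75ARFK and 87VPDA already contain an alanine, only three residues actually change in those two mutants.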

Figure 3: (A) Block diagram detailing the fragments of IBV N protein region 1 cloned into pECFP-C1. (B) Subcellular localization of fluorescent fusion proteins ECFP-IBVNR1(1-50), ECFP-IBVNR1(51-100) and ECFP-IBVNR1(101-132) in Vero cells using live cell imaging and coexpressed with EGFP-nucleolin in fixed cells and analyzed by META-confocal microscopy. ECFP and EGFP fluorescence is false coloured green and red, respectively. Merged images are also presented. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.

To test whether this amino acid sequence was involved in directing the nucleolar localization of N protein, this motif was deleted in the context of full length N protein tagged to EGFP (plasmid pEGFP-IBVNΔNoRS). This plasmid was transfected into Vero cells and the subcellular localization of the resulting fusion protein, EGFP-IBVNΔNoRS, investigated using live cell imaging. There was no nucleolar localization at 24 h post-transfection (Figure 5B, several examples are shown), compared with nucleolar localization in approximately 50% of cells expressing EGFP-IBV N protein (data not shown). These data indicated that the NoRS identified in region 1 was necessary for nucleolar retention of IBV N protein. As described, the preliminary investigation using the single and double region constructs indicated a potential NoRS in region 2. If the latter signal were functional, we would have expected a proportion of cells expressing EGFP-IBVNΔNoRS to show the fusion protein in the nucleolus. However, no nucleolar localization was observed, indicating that the WRRQARFK motif was necessary for nucleolar localization.

To determine whether the eight amino acid sequence identified in region 1 was sufficient to direct nucleolar retention, this motif was placed downstream of ECFP and DsRed, creating vectors pECFP-WRRQARFK and pDsRed-WRRQARFK, respectively. As a control, an amino acid sequence C-terminal of the region 1 NoRS, GRKPVPDA, identified in the tetra-alanine substitution mutagenesis as not being involved in nucleolar retention, was placed downstream of ECFP, creating vector pECFP-GRKPVPDA. Transfection of Vero cells with pECFP-WRRQARFK and pECFP-GRKPVPDA indicated that WRRQARFK directed ECFP to the nucleolus and nucleus, whereas GRKPVPDA did not (Figure 6A). In the case of the former construct, relative fluorescence revealed that there was approximately fourfold more ECFP in the nucleolus than in the nucleus. To investigate the role of both WRRQ and ARFK in nucleolar targeting, each motif was substituted with alanine and placed downstream of ECFP, creating expression vectors pECFP-WRRQAAAA and pECFP-AAAAARFK. Transfection and expression of these constructs in Vero cells (Figure 6A) indicated that whilst WRRQ could direct ECFP to the nucleus and nucleolus, ARFK was required for efficient nucleolar retention. In contrast, ECFP-AAAAARFK by itself localized predominantly to the cytoplasm, weakly to the nucleus and was absent from the nucleolus (Figure 6A).


To investigate whether the WRRQARFK motif targeted a specific part of the nucleolus, cells were cotransfected with pDsRed-WRRQARFK and either pEGFP-nucleolin or pEGFP-fibrillarin (Figure 6B, several examples are shown). These fusion proteins provided distinct markers for the nucleolus (25). The data indicated that WRRQARFK could also direct DsRed to the nucleolus (as well as ECFP, Figure 6A) and that WRRQARFK tagged to the appropriate fluorescent protein colocalized predominantly with EGFP-fibrillarin and EGFP-nucleolin and formed a punctate pattern in the nucleolus, which we tentatively defined as the DFC. Note that DsRed-WRRQARFK was also observed in the nucleus and cytoplasm, but the images are resolved in the linear range for the nucleolar signal, which is the predominant component of this localization.


Investigation of potential nucleolar targeting signals 
in IBV N protein region 2 

Similar to the approach used to identify the NoRS in region 1, region 2 was subdivided into two distinct components. Amino acids 133-200 and 201-265 were placed downstream of ECFP, creating expression vectors pECFP-IBVNR2(133-200) and pECFP-IBVNR2(201-265) (Figure 7A). Expression of these plasmids in Vero cells indicated that amino acids 133-200 directed ECFP to the cytoplasm and nucleus, with a subcellular localization similar to ECFP. In contrast, amino acids 201-265 directed ECFP to the nucleus with no evidence of nucleolar exclusion. Further investigation revealed that amino acids 201-220 and 211-230, when fused to ECFP, directed this protein to the nucleus and nucleolus, whereas amino acids 221-240 directed ECFP to the nucleus but not to the nucleolus (Figure 7B). Relative fluorescence indicated that the ratio of ECFP between the nucleus and nucleolus with amino acids 201-220 and 211-230 was approximately 1:1, in contrast to that observed with the NoRS identified in region 1, for which there was four times more signal in the nucleolus than in the nucleus. Taken together with the lack of nucleolar retention observed in cells expressing the region 1 NoRS deletion mutant in the context of full length N protein, we propose that WRRQARFK is necessary and sufficient to direct N protein to the nucleolus, and that no other signals are involved.

Figure 4: (A) Amino acid sequence of IBV N protein region 1 overlapping peptides placed C-terminal of ECFP and (B) analysis of the subcellular localization of these peptides (ECFP-IBVNR1(61-80), ECFP-IBVNR1(71-90) and ECFP-IBVNR1(81-100)) using live cell imaging and META-confocal microscopy of Vero cells 24 h post-transfection. ECFP and EGFP fluorescence is shown as green and red, respectively. Merged images with ECFP false coloured red are also presented. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.


Comparison of the IBV N protein NoRS with known 
cellular and viral NoRS 

Figure 5, panel A sequences:
ECFP-IBVNR1(71-90): WRRQARFKPGKGGRKPVPDA
ECFP-IBVNR1(71WRRQ→AAAA): AAAAARFKPGKGGRKPVPDA
ECFP-IBVNR1(75ARFK→AAAA): WRRQAAAAPGKGGRKPVPDA
ECFP-IBVNR1(79PGKG→AAAA): WRRQARFKAAAAGRKPVPDA
ECFP-IBVNR1(83GRKP→AAAA): WRRQARFKPGKGAAAAVPDA
ECFP-IBVNR1(87VPDA→AAAA): WRRQARFKPGKGGRKPAAAA

Figure 5: (A) Amino acid sequence of tetra-alanine substitution mutagenesis of pECFP-IBVNR1(71-90); the position of the substituted amino acids is underlined. In some cases, the substitution maintained the wild-type alanine. Analysis of the subcellular localization of these peptides (ECFP-IBVNR1(71WRRQ→AAAA), ECFP-IBVNR1(75ARFK→AAAA), ECFP-IBVNR1(79PGKG→AAAA), ECFP-IBVNR1(83GRKP→AAAA) and ECFP-IBVNR1(87VPDA→AAAA)) using live cell imaging of Vero cells at 24 h post-transfection. (B) Live cell imaging of cells expressing EGFP-IBVNΔNoRS and corresponding bright field images. Several examples are shown. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.

To investigate whether our identified NoRS was unique or had a mammalian cellular or viral equivalent, we conducted a search using the basic local alignment search tool (BLAST) (http://www.ncbi.nlm.nih.gov/BLAST/). No known mammalian cellular or non-avian coronavirus viral protein sequences were highlighted. Using AlignX, a consensus NoRS was derived by comparing the IBV N protein NoRS with known cellular and viral NoRSs which have been shown to target exogenous proteins to the nucleolus (Figure 8). The consensus sequence was composed of two basic groups, both of which were arginine rich. Compared with this consensus sequence, Arg72 and Arg73 of the IBV N protein NoRS were conserved, Arg76 and Lys78 were similar and Gln74 was weakly similar.
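
The observation that NoRSs are rich in arginine and lysine can be quantified with a simple residue count. The sketch below is a minimal illustration using two motifs quoted in the text (the IBV N protein NoRS and the fibroblast growth factor-2 motif); the function name basic_content and the choice of summary statistics are assumptions made for this example and are not part of the AlignX analysis.

```python
# Motifs quoted in the text; the dictionary keys are informal labels.
MOTIFS = {
    "IBV N protein": "WRRQARFK",
    "Fibroblast growth factor-2": "RSRKYTSWYVALKR",
}


def basic_content(motif):
    """Count arginine and lysine residues and report the basic fraction."""
    n_arg = motif.count("R")
    n_lys = motif.count("K")
    return {
        "length": len(motif),
        "arg": n_arg,
        "lys": n_lys,
        "basic_fraction": (n_arg + n_lys) / len(motif),
    }


if __name__ == "__main__":
    for name, motif in MOTIFS.items():
        print(name, motif, basic_content(motif))
```

A count of this kind describes composition only; it says nothing about the spacing of the basic residues, which is what the alignment in Figure 8 addresses.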

Delineation of a leucine-rich NES in IBV N protein

Bioinformatic analysis of N protein indicated that a predicted NES was located between amino acids 291-298, LQLDGLHL. To test whether this motif functioned as an NES, these amino acids were deleted in the context of wild-type N protein fused C-terminal of ECFP, creating plasmid pECFP-IBVNΔ291-298. As a control, amino acids 268-275 (VIAMLNLV), encompassing a hydrophobic region, were deleted in the context of wild-type N protein fused C-terminal of ECFP, creating plasmid pECFP-IBVNΔ268-275. Vero cells were transfected with these plasmids and analyzed by live cell imaging at 24 h post-transfection. The data indicated that the predicted NES deletion mutant (ECFP-IBVNΔ291-298) localized predominantly to the nucleus and nucleolus, whereas the control deletion (ECFP-IBVNΔ268-275) had no apparent effect on the subcellular localization of the protein (Figure 9A) when compared to wild-type N protein fused C-terminal of EGFP (Figures 2 and 9A) (23,25,43).

Figure 6: (A) Subcellular localization of fluorescent fusion proteins ECFP-WRRQARFK, ECFP-GRKPVPDA, ECFP-AAAAARFK and ECFP-WRRQAAAA in Vero cells using confocal microscopy. (B) Analysis of colocalization of DsRed-WRRQARFK with EGFP-fibrillarin and EGFP-nucleolin. IBV fusion peptides were false coloured green and nucleolar marker proteins red. Shown are the nucleus and nucleolus. Merged images are also presented. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.


Given that the bioinformatic analysis indicated that region 3 contained potential NLSs, we investigated whether the NES was dominant over these signals. The above NES deletion was made in the context of the region 3 fusion protein, pECFP-IBVNR3, creating plasmid pECFP-IBVNR3Δ291-298 for the expression of recombinant fusion protein. Live cell imaging of this protein in Vero cells at 24 h post-transfection indicated that when the NES was deleted (Figure 9B), region 3 had a similar localization pattern to ECFP only (Figure 2), in that the fragment localized to the nucleus and cytoplasm but not the nucleolus. These data indicated that, despite containing two predicted NLSs, region 3 did not accumulate in the nucleus or nucleolus, as would be predicted if it contained such active signals.

Figure 7: (A) Block diagram detailing fragments of IBV N protein region 2 cloned into pECFP-C1. Subcellular localization of ECFP-IBVNR2(133-200) and ECFP-IBVNR2(201-265) in Vero cells imaged using live cell microscopy. (B) Confocal analysis of the subcellular localization of IBV peptides (detailed) fused C-terminal of ECFP in Vero cells. The transmission phase contrast image is also presented. Scale bar is 10 µm, and the nucleolus (No) is arrowed where appropriate.

Figure 7, panel B peptides:
ECFP-IBVNR2(201-220): AKIIQDQQKKGSRITKAKAD
ECFP-IBVNR2(211-230): GSRITKAKADEMAHRRYCKR
ECFP-IBVNR2(221-240): EMAHRRYCKRTIPPNYRVDQ


Figure 8, aligned sequences:
NoRS IBV N protein: WRRQARFK
NoRS Aplysia learning-associated protein 1-19: MAKSIRSKHRRQMRMMKRE
NoRS HIV-1 Tat: RKKRRQRRRAHQ
NoRS (GGNNV) protein alpha: RRRANNRRR
NoRS angiogenin: (sequence not legible)
NoRS HSV gamma1 34.5: MARRRRHRGPRRPRPP
NoRS HIV-1 Rev: RRNRRRRWRERQRQT
NoRS fibroblast growth factor-2: RSRKYTSWYVALKR
NoRS survivin-deltaEx3: MQRKPTIRRKNLRLRRK
NoRS MDM2: KKLKKRNK
NoRS NF-kappa: RKKRKKK
Nuclear VCP-like protein (NVL2): KRKGKLKANKGSKRKK
NoRS p120: SKRLSSRARKRAAKRRLG
NoRS HIC p40: (sequence not legible)
NoRS herpes/mareks MEQ: (sequence not legible)
Consensus: (sequence not legible)

Figure 8: AlignX analysis of the IBV N protein NoRS with known cellular and viral NoRSs which can target an exogenous protein to the nucleolus. Conserved amino acids are shaded blue, similar amino acids shaded in green and weakly similar amino acids in green font. The cellular and viral NoRSs are described in: NoRS Aplysia learning-associated protein (68), NoRS HIV-1 Tat (69), NoRS (GGNNV) protein alpha (70), NoRS angiogenin (71), NoRS HSV gamma1 34.5 (21), NoRS HIV-1 Rev (72), NoRS fibroblast growth factor-2 (14), NoRS survivin-deltaEx3 (13), NoRS MDM2 (73), NoRS NF-kappa (74), NoRS nuclear VCP-like protein (NVL2) (75), NoRS p120 (22), NoRS HICp40 (76) and NoRS herpes/mareks MEQ (77).

To investigate the relative importance of leucine residues in nuclear export, appropriate alanine substitutions were made either in the context of the NES in region 3 only or in the context of wild-type N protein. To determine whether residues 293L and 298L were involved in nuclear export, these positions were substituted with alanine both individually and together in the context of ECFP fused to region 3 (pECFP-IBVNR3), creating plasmids pECFP-IBVNR3(293L→A), pECFP-IBVNR3(298L→A) and pECFP-IBVNR3(293/298L→A), respectively. These plasmids were transfected into Vero cells and the subcellular localization of the resulting fusion proteins analyzed by direct fluorescence using live cell imaging (Figure 9B). The data indicated that none of these changes affected the distribution of the fusion protein, suggesting that these amino acids were not involved in nuclear export. Residue 291L was substituted with alanine in the context of EGFP-IBVN, creating plasmid pEGFP-IBVN(291L→A). Expression of this fusion protein in Vero cells and analysis using relative fluorescence indicated an increased level of N protein in the nucleus (Figure 9C) when compared with expression of the wild-type N protein (Figure 2), thus suggesting that position 291L is involved in nuclear export. This is in contrast with EGFP-IBVN(293L→A) expressed in Vero cells, for which there was no apparent difference in localization compared with wild-type N protein (Figure 2).
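
The leucine-to-alanine substitutions described above are specified in full-length N protein coordinates, so a small helper that converts protein positions to string indices and checks the expected wild-type residue helps avoid off-by-one errors. The sketch below assumes the predicted NES reads LQLDGLHL and spans residues 291-298; the function name substitute is hypothetical.

```python
# Assumption: predicted NES of IBV N protein, residues 291-298.
NES = "LQLDGLHL"
NES_START = 291


def substitute(seq, first_pos, position, expected, new="A"):
    """Substitute one residue, addressed by 1-based protein position.

    first_pos is the protein position of seq[0]. The wild-type residue is
    checked against `expected` so that a mis-numbered position fails loudly.
    """
    idx = position - first_pos
    if not 0 <= idx < len(seq):
        raise ValueError(f"position {position} is outside the fragment")
    if seq[idx] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[idx]}"
        )
    return seq[:idx] + new + seq[idx + 1:]


if __name__ == "__main__":
    print("291L->A:", substitute(NES, NES_START, 291, "L"))
    print("293L->A:", substitute(NES, NES_START, 293, "L"))
    print("298L->A:", substitute(NES, NES_START, 298, "L"))
    double = substitute(substitute(NES, NES_START, 293, "L"), NES_START, 298, "L")
    print("293/298L->A:", double)
```

Checking the expected residue is a deliberate design choice: a request to mutate a leucine at position 292, for example, raises an error because that position is a glutamine under the stated assumption.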


Discussion 


NoRSs are not well characterized, and we have made use of the avian IBV coronavirus N protein to study these. To investigate whether IBV N protein contained a NoRS, the protein was initially expressed as a series of single and overlapping regions. This preliminary analysis indicated that IBV N protein contained a NoRS in region 1. Deletion mutagenesis delineated a 20 amino acid motif that modulated nucleolar retention. Subsequent tetra-alanine substitution mutagenesis highlighted that four residues, 71WRRQ, were crucial for the targeting function, with residues 75ARFK also promoting retention. The role of this novel octapeptide in nucleolar localization was confirmed by deletion mutagenesis in the context of full length N protein and by placing the motif C-terminal of ECFP and DsRed. These latter constructs also colocalized with nucleolin and fibrillarin, suggesting that the WRRQARFK motif directs a protein to the DFC. Certainly, IBV N protein has been shown to localize to the DFC (48).


This is the first description of a defined NoRS In a corona- 
virus N protein which localizes to the nucleolus. Although 
region 2 could localize ECFP to the nucleolus, the NoRS 
identified in region 1 resulted in nucleolar accumulation of 
fluorescent fusion proteins when comparing the ratio of 
protein in the nucleus versus the nucleolus. Region 2, 
although containing no predicted NLSs (and subcellular 
localization motifs), localized predominately to the 
nucleus. When region 2 was fused to region 3, ECFP 
localized predominately to the cytoplasm, suggesting 
that if an NLS is present, it is submissive to the NES in 
region 3. 


Studies have suggested that phosphorylation can 
control nucleolar retention of certain proteins (44). 
However, mass spectroscopic analysis revealed that 
no phosphorylated amino acids are present in region 1 
(37), and therefore, we hypothesize that phosphorylation 
plays no role in the specific nucleolar targeting 
activity. However, in the full-length protein, conformational 
changes may be induced by phosphorylation and/or 
cleavage to expose relevant motifs to direct the protein 
to appropriate subcellular localizations. Bioinformatic 
analysis found that some of the most abundant motifs 
within the nucleolar proteome were the RNA-recogni- 
tion motif and the DEAD/H box helicase domain (6). 
Although the non-phosphorylated form of IBV N protein 
binds cellular RNA with high affinity (37), this protein 
contains no known cellular RNA-binding motifs. In addition, 
phosphorylated N protein (which is present inside 
the cell) has low affinity for non-viral RNA (37). 
Sequence analysis reveals that region 3 of IBV N protein 
contains the amino acid sequence 371DEAD. 
However, the helicase activity of this protein is 
unknown. In addition, region 3 does not localize to the 
nucleolus, even when the NES was deleted. Also, other 
tri-peptide or tetra-peptide motifs containing either GR 
or RG motifs are over-represented in the nucleolar proteome 
(6) but not represented in the IBV N protein 
sequence. Therefore, we propose that the WRRQARFK 
motif acts exclusively as the NoRS for IBV N protein. 


Figure 9: (A) Live cell imaging showing the subcellular 
localization of EGFP-IBVN compared with EGFP-IBV N fusion 
protein with a deleted NES (EGFP-IBVNΔ291-298) and a control 
deletion (EGFP-IBVNΔ268-275). (B) Live cell imaging showing 
the subcellular localization of the NES knockout 
(ECFP-IBVNR3Δ291-298) and specific alanine substitution 
mutants in the context of the ECFP IBV N region 3 fusion 
proteins. Bright field images are also presented. (C) Confocal 
analysis of selected alanine substitution mutants in the 
context of EGFP-IBV N protein. The transmission phase contrast 
image is also presented. Scale bar is 10 µm, and the 
nucleolus (No) is arrowed where appropriate. 


The three-dimensional structure of IBV N protein region 1 
was modelled based on the solution structure of the equi- 
valent region of SARS-CoV N protein (45). Amino acid 
sequence alignment between the two regions (Figure 10A) 
and superimposition of the model on the structure of SARS- 
CoV N protein (Figure 10B) indicated that a credible mole- 
cular model of IBV N protein region 1 could be generated by 
comparative modelling methods (Figure 10C). Coupled to 
the tetra-alanine amino acid substitution mutagenesis 
analysis and subcellular localization of ECFP-WRRQARFK 
and DsRed-WRRQARFK to the nucleolus and comparison 
with cellular and viral NoRSs, the data indicated that Arg72 
and Arg73 were present at the bottom of a pocket and were 
thus accessible for interaction with cellular factors which 
may be involved in nucleolar targeting/retention. 


Although the focus of this study was to elucidate whether 
IBV N protein contained a NoRS, a functional NES was also 
identified in region 3. Interestingly, region 3 also contained 
two predicted NLSs, which could either have been submis- 
sive to the NES or not functional. Such basic regions are 
found in the C-terminal regions of other coronavirus N pro- 
teins and in the case of SARS-CoV proposed to act as an 
NLS (46). However, subsequent experimental data showed 
that this sequence was non-functional (25,47). Deletion 
mutagenesis of the IBV NES in the context of a region 3 
peptide again suggested that the region 3 NLS was non- 
functional in IBV. Our current and previous studies (23,24) 
indicated that fusing EGFP/ECFP with IBV N protein 
increased the molecular weight of this protein above the 
size exclusion limit of the nuclear pore complex. The fusion 
peptides generated in this study are close to the size exclusion 
limit (50-60 kDa) (9). The region 3 fusion protein 
(~43 kDa) is below the size exclusion limit and could theo- 
retically diffuse into and out of the nucleus. However, as 
discussed above, the peptide localized to the cytoplasm, 
again indicating the presence of a functional NES. 
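

The size argument above can be illustrated with a rough calculation. The following is only a sketch: the average residue mass of about 110 Da is a common rule of thumb and the tag length of 239 residues is an assumption; neither value is taken from this study. The 50-60 kDa limit and the amino acid ranges are those quoted in the text.

```python
# Rough mass estimate for fluorescent fusion proteins (illustrative only).
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass
TAG_RESIDUES = 239           # assumption: length of the ECFP/EGFP tag


def approx_mass_kda(n_residues):
    """Approximate protein mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def fusion_mass_kda(first_aa, last_aa, tag_residues=TAG_RESIDUES):
    """Approximate mass of a tag fused to N protein residues first_aa..last_aa."""
    insert_length = last_aa - first_aa + 1
    return approx_mass_kda(tag_residues + insert_length)


region3_fusion = fusion_mass_kda(265, 409)    # region 3, amino acids 265-409
full_length_fusion = fusion_mass_kda(1, 409)  # full-length N protein

print(f"region 3 fusion: ~{region3_fusion:.0f} kDa")
print(f"full-length fusion: ~{full_length_fusion:.0f} kDa")
```

Under these assumptions the region 3 fusion falls below the 50-60 kDa range, whereas the full-length fusion lies above it, which is consistent with the figures quoted in the text.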


Similar to other RNA viruses, IBV replication is error prone 
and the genome also undergoes recombination (48,49). 
Therefore, due to selection pressure, there is sequence 
variability causing the formation of different IBV strains 
(50), which has led to different clinical outcomes, such 
as predominately respiratory or nephropathogenic disease 
(34). This variability is reflected in the amino acid sequence 
of the N protein, with presumably essential amino acids 
remaining unchanged or being conserved. Comparison of 
the Beaudette strain N protein amino acid sequence (used 
in this study) with nine other strains of IBV including a 
nephropathogenic strain (accession numbers are shown in 
square brackets) Ark99 [M85244], DEO72 [AF203001], 
M41 [M28566], N2/75 [U52598], N1/62 [U52596], N9/74 
[U52597], OXIBV [AF199412], KB8523 [M21515] and LX4 
[AAQ21592] revealed that the essential NoRS 71WRRQ 
motif was identical in all strains, as was amino acid 78K. 
With regard to the NES located between amino acids 
291-298, amino acids 291L and 296L were identical, 
whereas 293L was not, and 298L was identical in all but 
one strain, where it was substituted by a valine (and thus 
was conserved). Substitution of appropriate leucine residues 
in the NES would support the role of the conserved 
leucines being functional in nuclear export. 


Figure 10: (A) Sequence alignment of IBV N protein region 1 
to SARS-CoV N protein region 1. Identical amino acids are 
indicated by an asterisk and the position of the tetra-alanine 
substitutions on IBV N protein are indicated by coloured boxes 
which correspond to their position on the three-dimensional 
model. (B) Mapping the modelled structure of the IBV N protein 
region 1 (red) on the solution structure of SARS-CoV protein 
region 1 (white). (C) Three-dimensional model of region 1 of 
IBV N protein; the position of the tetra-alanine substitutions 
is indicated by the appropriate colour as shown in (A). 


Analysis of the SARS-CoV N protein revealed that although it 
contains a potential NoRS(s) which has been mapped to 
region 2 (25), the protein localizes to the cytoplasm only in 
infected cells (25,47). However, it can localize to the 
nucleus/nucleolus when expressed in the absence of other 
viral proteins (51,52), albeit with low frequency when com- 
pared with IBV N protein (25). Unlike IBV N protein, SARS- 
CoV N protein does not possess a recognizable CRM-1- 
dependent NES but instead contains an uncharacterized 
cytoplasmic retention/NES in region 3 (25). 


Coronaviruses are closely related to arteriviruses, and 
although the arterivirus N protein has no discernable 
homology to the coronavirus N protein, and has a mole- 
cular weight of 15 kDa, several arterivirus N proteins have 
been shown to localize to the nucleolus (53,54). In the 
case of the arterivirus porcine reproductive and respiratory 
syndrome virus (PRRSV) N protein, two NLSs were iden- 
tified, one which directed the protein to the nucleus and 
one which directed the protein to the nucleus and nucleo- 
lus (53). In comparison to the IBV N protein NoRS, the 
PRRSV NoRS was lysine rather than arginine rich. 
Bioinformatic analysis using NetNES predictor revealed 
no obvious NES in this protein, although the equine arter- 
itis virus N protein is sensitive to leptomycin B treatment 
(54). 


An emerging paradigm is that both plant and animal 
positive-strand RNA viruses can interact with the nucleus, the 
nucleolus and nucleolar proteins to recruit factors to aid 
in virus replication and/or subvert host cell function (55-63). 
If viral proteins target subnuclear structures such as the 
nucleolus, then they must contain appropriate signalling 
motifs, not only for localization but crucially for export back 
to the cytoplasm. If such proteins were to be retained in 
the nucleolus, then we would predict that virus replication 
would be down-regulated, as the principal site of replication 
for positive-strand RNA viruses is the cytoplasm. 
Disrupting the efficiency of nucleolar localization/nuclear 
export of viral proteins may therefore be a way of attenu- 
ating positive-strand virus replication, whether as part of 
an antiviral strategy or for the design of recombinant vac- 
cines. Certainly, the replication of HIV-1 can be inhibited 
by disrupting the interaction of HIV-1 with the nucleolus 
(64,65). In summary, IBV N protein contains an eight 
amino acid motif which is necessary and sufficient for 
nucleolar retention and a functional NES to traffic the 
protein to the cytoplasm. 


Materials and Methods 


Cell culture 

Vero (monkey-derived kidney epithelial) cells were grown at 37 °C with 5% 
CO2 in minimum Eagle's medium (MEM) supplemented with 10% foetal calf 
serum and penicillin/streptomycin as described previously (24). 


Construction of plasmids 

The N gene from the Beaudette strain (accession number: AAA46214) of 
IBV served as a template for PCR of region and subregion constructs. 
Primers used incorporated a 5′-XhoI site and a 3′-SacII site for cloning into 
pECFP-C1 (enhanced cyan fluorescent protein) (Clontech, Palo Alto, CA, 
USA). At all times, numbers used in primer or construct names denoting 
amino acid numbers refer to their position on the full-length N protein. 
Primers used for the constructs were as follows: pECFP-NR1 (amino acids 
1-133), forward primer GGCCGGTCCTCGAGCCATGGCAAGCGGTAAAGCAGCTGG 
and reverse primer GACCGGTCCCGCGGCTAATCTCTTGTACCCTGATTGGATC; 
pECFP-NR2 (amino acids 133-265), forward primer 
GGCCGGTCCTCGAGCCATGGATCCTGATAAGTTTGACCAATA and reverse primer 
GACCGGTCCCGCGGCTAATCCTTAATACCTTCCTCATTCATCT and 
pECFP-NR3 (amino acids 265-409), forward primer 
GGCCGGTCCTCGAGCCATGGGGCGTGTTACAGCAATGCTCAA and reverse primer 
GACCGGTCCCGCGGCTAAAGTTCATTCTCTCCTAGAGCTGCAT. The double region 
construct pECFP-NR2+3 was produced using a combination of the 
appropriate forward and reverse primers. The double region construct pECFP-NR1+2 
was constructed as described previously (43). Production of the N1 
subregion constructs utilized the following primer combinations: pECFP-NR1_1-50, 
forward primer GGCCGGTCCTCGAGCCATGGCAAGCGGTAAA and 
reverse primer GACCGGTCCCGCGGCTACTTGGGCGGAGG; 
pECFP-NR1_51-100, forward primer GGCCGGTCCTCGAGCCATGTTTGAAGGTAGCGGT 
and reverse primer GACCGGTCCCGCGGCTAAGGTCCTGTTCC 
and pECFP-NR1_101-133, forward primer GGCCGGTCCTCGAGCCATGGCCGCTGACCTGAAC 
and reverse primer GACCGGTCCCGCGGCTAATCTCTTGTACCCTG. 
PCR products were purified and subcloned into pCR2.1 TOPO 
vector (Invitrogen, Carlsbad, CA, USA). DNA was purified by alkaline lysis 
(66), and digested using XhoI and SacII before being ligated into pECFP-C1 
using T4 DNA ligase (Invitrogen), as per the manufacturer's instructions. 
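

As an illustration of how primer designs of this kind can be sanity-checked, the sketch below tests the pECFP-NR1 primer pair listed above for the XhoI (CTCGAG) and SacII (CCGCGG) recognition sequences, a start codon downstream of the XhoI site and a stop codon (read on the sense strand) next to the SacII site. The function names are hypothetical and the checks are not part of the published protocol.

```python
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence
SACII_SITE = "CCGCGG"  # SacII recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def check_primer_pair(forward, reverse):
    """Run simple design checks on a forward/reverse cloning primer pair."""
    fwd_site = forward.find(XHOI_SITE)
    rev_site = reverse.find(SACII_SITE)
    after_fwd = forward[fwd_site + len(XHOI_SITE):] if fwd_site >= 0 else ""
    after_rev = reverse[rev_site + len(SACII_SITE):] if rev_site >= 0 else ""
    return {
        "forward_has_xhoi": fwd_site >= 0,
        "reverse_has_sacii": rev_site >= 0,
        "start_codon_after_xhoi": "ATG" in after_fwd,
        # The three bases next to the SacII site, read on the sense strand.
        "stop_codon_after_sacii": reverse_complement(after_rev[:3]) in STOP_CODONS,
    }


# pECFP-NR1 primers as listed in the text.
NR1_FORWARD = "GGCCGGTCCTCGAGCCATGGCAAGCGGTAAAGCAGCTGG"
NR1_REVERSE = "GACCGGTCCCGCGGCTAATCTCTTGTACCCTGATTGGATC"

checks = check_primer_pair(NR1_FORWARD, NR1_REVERSE)
for name, passed in checks.items():
    print(name, passed)
```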


To produce the full-length inserts used for delineation of the NoRS in N1, 
the following oligonucleotides comprising restriction overhangs (5′, XhoI; 
3′, SacII) were used: ECFP-N1b_61-80, forward primer 
TCGAGCCATGAACATTAAGCCAAGCCAGCAACATGGATACTGGAGACGCCAAGCCAGGTTTAAGCCAGGCTAGCCGC 
and reverse primer 
GGCTAGCCTGGCTTAAACCTGGCTTGGCGTCTCCAGTATCCATGTTGCTGGCTTGGCTTAATGTTCATGGC; 
ECFP-N1b_71-90, forward primer 
TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGCCAGGCAAAGGTGGAAGAAAACCAGTCCCAGATGCTTAGCCGC 
and reverse primer 
GGCTAAGCATCTGGGACTGGTTTTCTTCCACCTTTGCCTGGCTTACCTGGCTTGGCGTCTCCACATGGC 
and ECFP-N1b_81-100, forward primer 
TCGAGCCATGAAAGGTGGAAGAAAACCAGTCCCAGATGCTTGGTACTTTTACTATACTGGAACAGGACCTTAGCCGC 
and reverse primer 
GGCTAAGGTCCTGTTCCAGTATAGTAAAAGTACCAAGCATCTGGGACTGGTTTTCTTCCACCTTTCATGGC. 
Tetra-alanine substitution analysis of the region 
N1b_71-90 was also undertaken utilizing full-length oligonucleotides with 
5′-XhoI and 3′-SacII restriction overhangs: ECFP-N1b_71WRRQ-AAAA, 
TCGAGCCATGGCCGCCGCCGCCGCCAGGTTTAAGCCAGGCAAAGGTGGAAGAAAACCAGTCCCAGATGCTTAGCCGC 
and 
GGCTAAGCATCTGGGACTGGTTTTCTTCCACCTTTGCCTGGCTTAAACCTGGCGGCGGCGGCGGCCATGGC; 
ECFP-N1b_75ARFK-AAAA, 
TCGAGCCATGTGGAGACGCCAAGCCGCCGCCGCCCCAGGCAAAGGTGGAAGAAAACCAGTCCCAGATGCTTAGCCGC 
and 
GGCTAAGCATCTGGGACTGGTTTTCTTCCACCTTTGCCTGGGGCGGCGGCGGCTTGGCGTCTCCACATGGC; 
ECFP-N1b_79PGKG-AAAA, 
TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGGCCGCCGCCGCCGGAAGAAAACCAGTCCCAGATGCTTAGCCGC 
and 
GGCTAAGCATCTGGGACTGGTTTTCTTCCGGCGGCGGCGGCCTTAAACCTGGCTTGGCGTCTCCACATGGC; 
ECFP-N1b_83GRKP-AAAA, 
TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGCCAGGCAAAGGTGCCGCCGCCGCCGTCCCAGATGCTTAGCCGC 
and 
GGCTAAGCATCTGGGACGGCGGCGGCGGCACCTTTGCCTGGCTTAAACCTGGCTTGGCGTCTCCACATGGC 
and ECFP-N1b_87VPDA-AAAA, 
TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGCCAGGCAAAGGTGGAAGAAAACCAGCCGCCGCCGCCTAGCCGC 
and 
GGCTAGGCGGCGGCGGCTGGTTTTCTTCCACCTTTGCCTGGCTTAAACCTGGCTTGGCGTCTCCACATGGC. 
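

The effect of a tetra-alanine oligonucleotide can be checked by translating it in silico. The sketch below translates the forward oligonucleotides for residues 71-90 and for the 71WRRQ to AAAA substitution, as listed above, using the standard genetic code, and reports which residues differ. The helper names are hypothetical, and residue numbers assume that the codon after the initiating ATG corresponds to amino acid 71.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_from_start(seq):
    """Translate from the first ATG up to (not including) the first stop codon."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


WILD_TYPE_OLIGO = ("TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGCCAGGCAAAGGT"
                   "GGAAGAAAACCAGTCCCAGATGCTTAGCCGC")
WRRQ_AAAA_OLIGO = ("TCGAGCCATGGCCGCCGCCGCCGCCAGGTTTAAGCCAGGCAAAGGT"
                   "GGAAGAAAACCAGTCCCAGATGCTTAGCCGC")

wild_type = translate_from_start(WILD_TYPE_OLIGO)
mutant = translate_from_start(WRRQ_AAAA_OLIGO)

# Index 0 is the initiating methionine, so index i maps to residue 70 + i.
changes = [(70 + i, w, m)
           for i, (w, m) in enumerate(zip(wild_type, mutant)) if w != m]
print(wild_type)
print(mutant)
print(changes)
```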


The octa-peptide nucleolar localization signal (described below) was deleted 
in the context of full-length N protein by overlapping PCR using forward 
primer Jae1 forward GAATTCATGGCAATGGCAAGCGGTAAAGCAGCTGGA 
and reverse primer NoRS reverse TCCACCTTTGCCTGGGTATCCATGTTGCTGGC 
to generate one PCR product and forward primer NoRS forward 
GCCAGCAACATGGATACCCAGGCAAAGGTGGA and Jae reverse 
GGATCCTCAAAGTTCATTCTCTCCTAGATGC (25) to generate the second 
PCR product. A second round of PCR was performed using both PCR 
products as templates and Jac1 forward and reverse primers. The resulting 
product was subcloned into pCR2.1, and then the fragment restricted into 
pEGFP-C2 (enhanced green fluorescent protein). 
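

The logic of the overlapping PCR deletion can be sketched as follows. The two internal primers are reverse complements of one another and span the junction that results when the 24 nucleotides encoding WRRQARFK are removed. The flanking and motif sequences below are pieced together from the region 1 oligonucleotides listed earlier in this section, which is an assumption about the template; the variable names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# Internal primers as listed in the text.
NORS_FORWARD = "GCCAGCAACATGGATACCCAGGCAAAGGTGGA"
NORS_REVERSE = "TCCACCTTTGCCTGGGTATCCATGTTGCTGGC"

# Assumed template around the motif (taken from the region 1 oligonucleotides).
UPSTREAM = "GCCAGCAACATGGATAC"      # coding sequence ending at residue 70
MOTIF = "TGGAGACGCCAAGCCAGGTTTAAG"  # codons for WRRQARFK (residues 71-78)
DOWNSTREAM = "CCAGGCAAAGGTGGA"      # coding sequence starting at residue 79

template_fragment = UPSTREAM + MOTIF + DOWNSTREAM
deleted_fragment = template_fragment.replace(MOTIF, "")

primers_are_complementary = reverse_complement(NORS_REVERSE) == NORS_FORWARD
primer_spans_junction = deleted_fragment == NORS_FORWARD
deletion_keeps_frame = len(MOTIF) % 3 == 0

print(primers_are_complementary, primer_spans_junction, deletion_keeps_frame)
```

Because the deleted stretch is a whole number of codons, the reading frame downstream of the junction is unchanged in this sketch.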


To investigate whether the octa-peptide was sufficient to direct nucleolar 
localization, the signal and appropriate control peptides were placed 
C-terminal of ECFP and/or DsRed by generating overlapping oligonucleotides 
with 5′-XhoI and 3′-SacII restriction overhangs for direct ligation into 
either pECFP-C1 or pDsRed-C1 which had been digested with the 
appropriate restriction enzymes. WRRQARFK was placed C-terminal of ECFP 
and DsRed, creating pECFP-WRRQARFK and pDsRed-WRRQARFK, 
forward primer TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGTAGCCGC 
and reverse primer GGCTACTTAAACCTGGCTTGGCGTCTCCACATGGC. 
The same cloning strategy was used to generate pECFP-GRKPVPDA 
(forward primer TCGAGCCATGGGAAGAAAACCAGTCCCAGATGCTTAGCCGC 
and reverse primer GGCTAAGCATCTGGGACTGGTTTTCTTCCCATGGC), 
pECFP-WRRQAAAA (forward primer 
TCGAGCCATGTGGAGACGCCAAGCCGCCGCCGCCTAGCCGC and reverse primer 
GGCTAGGCGGCGGCGGCTTGGCGTCTCCACATGGC) and pECFP-AAAAARFK (forward primer 
TCGAGCCATGGCCGCCGCCGCCGCCAGGTTTAAGTAGCCGC and reverse 
primer GGCTACTTAAACCTGGCGGCGGCGGCGGCCATGGC). 
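

To illustrate why annealed oligonucleotides of this design can be ligated directly into a digested vector, the sketch below aligns the reverse oligonucleotide (as its reverse complement) against the forward oligonucleotide and reports the single-stranded ends of the forward strand. The function name is hypothetical. For the WRRQARFK pair, the ends correspond to the overhangs left by XhoI (C^TCGAG) and SacII (CCGC^GG).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def annealed_overhangs(forward, reverse):
    """Return the unpaired 5' and 3' ends of the forward strand after annealing."""
    paired = reverse_complement(reverse)
    offset = forward.find(paired)
    if offset < 0:
        raise ValueError("oligonucleotides are not complementary")
    return forward[:offset], forward[offset + len(paired):]


# WRRQARFK oligonucleotide pair as listed in the text.
FORWARD = "TCGAGCCATGTGGAGACGCCAAGCCAGGTTTAAGTAGCCGC"
REVERSE = "GGCTACTTAAACCTGGCTTGGCGTCTCCACATGGC"

five_prime, three_prime = annealed_overhangs(FORWARD, REVERSE)
print("5' overhang:", five_prime)
print("3' overhang:", three_prime)
```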


To investigate potential targeting signals in region 2 of IBV N protein, a 
similar cloning strategy was used. Region 2 was subdivided between 
amino acids 133-200 and 201-265 and placed C-terminal of ECFP. PCR 
primers used were pECFP-IBVNR2_133-200 (forward primer 
GGCCGGTCCTCGAGCCATGGATCCTGATAAGTTT and reverse primer 
GACCGGTCCCGCGGCTATGCACGAGCAAT) and pECFP-IBVNR2_201-265 
(forward primer GGCCGGTCCTCGAGCCATGGCAAAGATAATCCAG and 
reverse primer GACCGGTCCCGCGGCTAATCCTTAATACCTTCCTC). 
Potential targeting signals located between amino acids 201-240 were 
further investigated using expression constructs comprised of overlapping 
oligonucleotides to generate amino acid peptides C-terminal of ECFP (as 
described above for region 1): pECFP-IBVNR2_201-220 (forward primer 
TCGAGCCATGGCAAAGATAATCCAGGATCAGCAGAAAAAGGGCTCTCGCATTACCAAGGCAAAGGCAGATTAGCCGC 
and reverse primer 
GGCTAATCTGCCTTTGCCTTGGTAATGCGAGAGCCCTTTTTCTGCTGATCCTGGATTATCTTTGCCATGGC), 
pECFP-IBVNR2_211-230 (forward primer 
TCGAGCCATGGGCTCTCGCATTACCAAGGCAAAGGCAGATGAAATGGCTCATCGCCGGTATTGCAAGCGCTAGCCGC 
and reverse primer 
GGCTAGCGCTTGCAATACCGGCGATGAGCCATTTCATCTGCCTTTGCCTTGGTAATGCGAGAGCCCATGGC) 
and pECFP-IBVNR2_221-240 (forward primer 
TCGAGCCATGGAAATGGCTCATCGCCGGTATTGCAAGCGCACTATCCCACCTAATTATAGGGTTGATCAATAGCCGC 
and reverse primer 
GGCTATTGATCAACCCTATAATTAGGTGGGATAGTGCGCTTGCAATACCGGCATGAGCCATTTCCATGGC). 


Deletion mutagenesis was undertaken using the Stratagene (La Jolla, CA, 
USA) QuikChange II kit as per the manufacturer's instructions. The following 
primers were used to delete each putative NES site: EGFP-IBVNΔ268-275 
(control), forward primer GTATTAAGGATGGGCGTCCTAGCAGCCATGCT 
and reverse primer AGCATGGCTGCTAGGACGCCCATCCTTAATAC and 
EGFP-IBVNΔ291-298 (NES), forward primer 
GAAGTAGAGTGACACCCAAATTTGAATTTACTACTGTGGTCC and reverse primer 
GGACCACAGTAGTAAATTCAAATTTGGGTGTCACTCTACTTC. 


Alanine substitution mutagenesis of the NES in the context of region 3 
(ECFP-IBVNR3) was undertaken using the Stratagene QuikChange Multi kit 
as per the manufacturer's instructions. Primers used to make the following 
constructs were pECFP-IBVNR3_293L-A, forward primer 
GAGTGACACCCAAACTTCAAGCCGATGGGCTTCACTTGAGA and reverse primer 
TCTCAAGTGAAGCCCATCGGCTTGAAGTTTGGGTGTCACTC and 
pECFP-IBVNR3_298L-A, forward primer 
CTTCAACTAGATGGGCTTCACGCCAGATTTGAATTTACTACTGTGGT and reverse primer 
ACCACAGTAGTAAATTCAAATCTGGCGTGAAGCCCATCTAGTTGAAG. 
pECFP-IBVNR3_293/298L-A was produced using a combination of the above 
primers as per kit 
instructions. Overlapping PCR as described above was used to substitute 
291L for A (pEGFP-IBVN_291L-A) and 293L for A (pEGFP-IBVN_293L-A) in the 
context of EGFP-IBVN. Unique primer combinations were 291L for A (forward 
GAAGTAGAGTGACACCCAAAGCCCAACTAGATGGGCTTCACTT and 
reverse AAGTGAAGCCCATCTAGTTGGGCTTTGGGTGTCACTCTACTTC) 
and 293L for A (forward GAGTGACACCCAAACTTCAAGCCGATGGGCTTCACTTGAGA 
and reverse TCTCAAGTGAAGCCCATCGGCTTGAAGTTTGGGTGTCACTC). 


The sequences of all constructs used in this study were confirmed by 
sequencing and where appropriate expression of the resulting fusion pro- 
teins by Western blot (data not shown). 


Live cell imaging 

Vero cells were transfected with 1 µg DNA to 5 µg polyethylenimine (PEI). 
DNA and PEI were mixed in a total volume of 200 µL serum-free 
Dulbecco's modified Eagle's medium (DMEM) and incubated at 20 °C for 
30 min. Thirty-millimetre cell culture dishes were seeded with 2 × 10^5 
cells in MEM as per cell culture methods. Transfection mix was added 
drop-wise to cells and incubated at 37 °C with 5% CO2 for 24 h. Live cell 
imaging was performed using a Nikon Eclipse TS100 microscope utilizing 
the appropriate filter for each tag (e.g. Filter B-2A, excitation 450-490 nm 
for ECFP/EGFP). Fluorescence and bright-field images were captured using 
a Nikon Digital Sight DS-L1. 


Confocal microscopy 

Confocal sections of fixed samples were captured on an LSM510 META 
microscope (Carl Zeiss Ltd., Oberkochen, Germany) equipped with ×40 
and ×63, NA 1.4, oil immersion lenses. Pinholes were set to allow optical 
sections of 1 µm to be acquired. In singly transfected cells, ECFP was 
excited with the 458 nm argon laser line running at 10%, and emission 
was collected through a BP435-485 emission filter. EGFP was excited with 
the 488 nm argon laser line running at 2%, and emission was collected 
through a LP505 filter. DsRed was excited with the helium : neon 543 nm 
laser line in all cases, and emission was collected through a LP560 filter. 
Due to excitation of the EGFP molecule by the 458 nm argon laser line, 
EGFP and ECFP cotransfected samples were linearly unmixed using the 
META detector. Lambda plots of EGFP and ECFP were generated from 
singly transfected reference samples excited with the 458 nm argon laser 
line and collected with the META detector between 461 and 536 nm, in 
10.7-nm increments. These lambda plots were then utilized to separate, or 
unmix, overlapping emission signal from cotransfected samples. All fluor- 
escence was measured in the linear range as the detector is a photomul- 
tiplier, and the range indicator was utilized to ensure that no saturated 
pixels were obtained on image capture. Images were scanned 16 times. 
The absence of cross-talk between channels was confirmed by switching off the 
appropriate excitation laser and imaging the corresponding emission (43). 
Relative fluorescent intensity was measured every 0.2 µm and averaged 
for the cytoplasm, nucleus and nucleolus. In this study, all confocal images 
showing IBV N protein or derived peptides and nucleolar marker proteins 
are presented in green and red, respectively (false coloured where appro- 
priate using the Zeiss LSM Image Browser). Colocalization is shown in 
yellow. 
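

The principle behind linear unmixing of overlapping EGFP and ECFP emission can be sketched as a least-squares fit of each pixel's spectrum to the two reference spectra (lambda plots). The code below is a minimal illustration and not the algorithm implemented in the microscope software. The reference spectra are invented placeholder values over eight channels, not measured data, and the function name is hypothetical.

```python
def unmix_two(pixel, ref_a, ref_b):
    """Least-squares weights (a, b) such that pixel ~ a * ref_a + b * ref_b."""
    saa = sum(x * x for x in ref_a)
    sbb = sum(y * y for y in ref_b)
    sab = sum(x * y for x, y in zip(ref_a, ref_b))
    spa = sum(p * x for p, x in zip(pixel, ref_a))
    spb = sum(p * y for p, y in zip(pixel, ref_b))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("reference spectra are linearly dependent")
    # Solve the 2 x 2 normal equations by Cramer's rule.
    a = (spa * sbb - spb * sab) / det
    b = (spb * saa - spa * sab) / det
    return a, b


# Placeholder reference spectra (assumed values, one entry per detector channel).
REF_ECFP = [0.90, 1.00, 0.80, 0.50, 0.30, 0.20, 0.10, 0.05]
REF_EGFP = [0.00, 0.05, 0.20, 0.60, 1.00, 0.80, 0.50, 0.30]

# A synthetic mixed pixel containing 2 parts ECFP and 3 parts EGFP signal.
mixed_pixel = [2.0 * c + 3.0 * g for c, g in zip(REF_ECFP, REF_EGFP)]

ecfp_signal, egfp_signal = unmix_two(mixed_pixel, REF_ECFP, REF_EGFP)
print(round(ecfp_signal, 6), round(egfp_signal, 6))
```

In practice the reference spectra would come from singly transfected samples, as described above, and the fit would be repeated for every pixel.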


Protein alignment and comparative modelling 

AlignX (VectorNTI, version 9.0) was used to align amino acid sequences for 
generation of a consensus NoRS. The application uses a modified Clustal 
W algorithm. The sequence of the IBV N protein region 1 was searched 
against the Protein Data Bank, and the N-terminal RNA-binding domain of 
SARS-CoV nucleocapsid protein [(45); PDB entry 1ssk chain A] was identified as 
a suitable template structure for comparative modelling (with a BLAST E-value 
of 1 × 10^-18 for residues of the protein region 1). The model is 
based on the coordinates of the NMR minimized average structure. The two 
sequences share 38% sequence identity over 137 residues. The sequence alignment 
(Figure 8B) was used to construct a model for IBV N protein region 1 using 
the comparative protein structure modelling program MODELLER (67). 
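

For readers unfamiliar with how a figure such as 38% identity over 137 residues is obtained, the sketch below computes percent identity from a pairwise alignment. It is an illustration only: the two aligned strings are invented toy sequences, the function name is hypothetical, and the choice to ignore gapped columns is one of several conventions in use and may differ from the one applied by the alignment software.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (not real N protein sequence).
toy_identity = percent_identity("MKT-AYIA", "MRTQAY-A")
print(f"{toy_identity:.1f}% identity")

# 38% identity over 137 residues corresponds to roughly this many matches.
approx_identical_positions = round(0.38 * 137)
print(approx_identical_positions)
```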